68 research outputs found

    CiThruS2 : Open-source Photorealistic 3D Framework for Driving and Traffic Simulation in Real Time

    The automotive and transport sector is undergoing a paradigm shift from manual to highly automated driving. This transition is driven by a proliferation of advanced driver assistance systems (ADAS) that seek to provide vehicle occupants with a safe, efficient, and comfortable driving experience. However, increasing the level of automation makes exhaustive physical testing of ADAS technologies impractical. Therefore, the automotive industry is increasingly turning to virtual simulation platforms to speed up time-to-market. This paper introduces the second version of our open-source See-Through Sight (CiThruS) simulation framework that provides a novel photorealistic virtual environment for vision-based ADAS development. Our 3D urban scene supports realistic traffic infrastructure and driving conditions with a plurality of time-of-day, weather, and lighting effects. Different traffic scenarios can be generated with practically any number of autonomous vehicles and pedestrians that can be made to comply with dedicated traffic regulations. All implemented features have been carefully optimized and the performance of our lightweight simulator exceeds 4K (3840 × 2160) rendering speed of 60 frames per second when run on an NVIDIA GTX 1060 graphics card or equivalent consumer-grade hardware. Photorealistic graphics rendering and real-time simulation speed make our proposal suitable for a broad range of applications, including interactive driving simulators, visual traffic data collection, virtual prototyping, and traffic flow management.
    Accepted version. Peer reviewed.

    uvgVenctester: Open-Source Test Automation Framework for Comprehensive Video Encoder Benchmarking

    Accepted version. Peer reviewed.

    Design Space Exploration of Practical VVC Encoding for Emerging Media Applications

    Versatile Video Coding (VVC/H.266) is the latest video coding standard designed for a broad range of next-generation media applications. This paper explores the design space of practical VVC encoding by profiling the Fraunhofer Versatile Video Encoder (VVenC). All experiments were conducted over five 2160p video sequences and their downsampled versions under the random access (RA) condition. The exploration was performed by analyzing the rate-distortion-complexity (RDC) of the VVC block structure and coding tools. First, VVenC was profiled to provide a breakdown of coding block distribution and coding tool utilization in it. Then, the usefulness of each VVC coding tool was analyzed for its individual impact on overall RDC performance. Finally, our findings were elevated to practical implementation guidelines: the highest coding gains come with the multi-type tree (MTT) structure, adaptive loop filter (ALF), cross-component linear model (CCLM), and bi-directional optical flow (BDOF) coding tools, whereas multi transform selection (MTS) and affine motion estimation are the primary candidates for complexity reduction. To the best of our knowledge, this is the first work to provide a comprehensive RDC analysis for practical VVC encoding. It can serve as a basis for practical VVC encoder implementation or optimization on various computing platforms.
    Published version. Peer reviewed.

    Spatio-Temporal Parallelization Scheme for HEVC Encoding on Multi-Computer Systems

    High Efficiency Video Coding (HEVC) sets the scene for economic video transmission and storage, but its inherent computational complexity calls for efficient parallelization techniques. This paper introduces and compares three different parallelization strategies for HEVC encoding on multi-computer systems: 1) spatial parallelization scheme, where input video frames are divided into slices and distributed among available computers; 2) temporal parallelization scheme, where input video is distributed among computers in groups of consecutive frames; 3) spatio-temporal parallelization scheme that combines the proposed spatial and temporal approaches. All three schemes were benchmarked as part of the practical Kvazaar open-source HEVC encoder. Our experimental results on 2–5 computer configurations show that using the spatial scheme gives 1.65×–2.90× speedup at the cost of 4.16%–13.09% bitrate loss over a single-computer setup. The respective speedup with temporal parallelization is 1.86×–3.26× without any coding overhead. The spatio-temporal scheme with 2 slices was shown to offer the best load-balancing with 1.81×–3.55× speedups and a constant coding loss of 4.16%.
    Accepted version. Peer reviewed.
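
The temporal scheme described in the abstract above, where the input video is distributed among computers in groups of consecutive frames, can be illustrated with a minimal sketch. The abstract does not say how groups are scheduled, so the round-robin policy, the function name `assign_frame_groups`, and its parameters are assumptions made for illustration only and are not taken from Kvazaar.

```python
def assign_frame_groups(num_frames, group_size, num_computers):
    """Split a video into groups of consecutive frames and hand them out.

    Hypothetical helper: round-robin scheduling is an assumption, not the
    policy used in the paper. Each group is a half-open range (start, end).
    """
    assignment = {computer: [] for computer in range(num_computers)}
    for index, start in enumerate(range(0, num_frames, group_size)):
        end = min(start + group_size, num_frames)  # last group may be short
        assignment[index % num_computers].append((start, end))
    return assignment


if __name__ == "__main__":
    # 10 frames, groups of 4 frames, 2 computers
    print(assign_frame_groups(10, 4, 2))
```

Because each group is encoded independently, a scheme of this kind needs no cross-computer prediction within a frame, which is consistent with the abstract reporting no coding overhead for the temporal approach.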

    Tailored AVX2 Transform Kernels for Versatile Video Coding

    Transform coding tools play an integral part in video codecs due to their substantial impact on coding efficiency. The latest video coding standard, Versatile Video Coding (VVC), makes the most of these tools by introducing new DST7, DCT8, and non-square transforms alongside the conventional DCT2 transform. This paper proposes optimized AVX2 kernels for all these transforms to speed up VVC coding. Unlike existing solutions, our kernels are specially tailored for each VVC transform type and block size. Accelerating our open-source uvg266 VVC encoder with the proposed kernels yields up to a 1.1× speedup under the all intra (AI) coding condition without any coding overhead. Our implementations make forward DCT2 and DST7/DCT8 transforms 4.0× and 6.7× as fast as their respective scalar implementations in the VTM reference encoder. They also outpace the AVX2 kernels of the practical VVenC encoder by factors of 3.0× and 2.8×. The respective speedups rise up to 5.3×, 11.1×, 3.4×, and 3.0× with inverse transforms.
    Peer reviewed.

    Open-source RTP Library for End-to-End Encrypted Real-Time Video Streaming Applications

    Information security has become a key success factor for streaming media applications that are increasingly vulnerable to wiretapping, message forgery, data tampering, hacking, and other possible cyberattacks. This paper addresses the existing security risks in real-time video streaming by introducing a new security extension to our uvgRTP open-source Real-time Transport Protocol (RTP) library. The proposed solution improves content integrity and privacy by adopting Secure RTP (SRTP) and Zimmermann RTP (ZRTP) for media End-to-End Encryption (E2EE). These new security mechanisms make uvgRTP the first open-source library that supports on-the-fly encrypted AVC, HEVC, and VVC video streaming. Our performance results on an Intel Core i7-4770 processor show that uvgRTP is able to transport encrypted 8K VVC video at up to 187 fps and 8K HEVC video at up to 120 fps over a 10 Gbps Local Area Network (LAN). The achieved transfer rate for encrypted HEVC video is 50% higher and latency 86% lower than the respective performance values of FFmpeg in unencrypted HEVC streaming. These top streaming speed results with state-of-the-art video codec support, advanced encryption mechanisms, and the permissive BSD license make uvgRTP an attractive solution for a broad range of commercial and academic streaming media applications.
    Accepted version. Peer reviewed.

    uvgRTP 2.0: Open-Source RTP Library For Real-Time VVC/HEVC Streaming

    Real-time video transport plays a central role in various interactive and streaming media applications. This paper presents a new release of our open-source Real-time Transport Protocol (RTP) library called uvgRTP (github.com/ultravideo/uvgRTP) that is designed for economic video and audio transmission in real time. It is the first public library that comes with built-in support for modern VVC, HEVC, and AVC video codecs and Opus audio codec. It can also be tailored to diversified media formats with an easy-to-use generic API. According to our experiments, uvgRTP can stream 8K VVC video at 300 fps with an average round-trip latency of 4.9 ms over a 10 Gbit link. This cross-platform library can be run on Windows and Linux operating systems and the permissive BSD 2-Clause license makes it accessible to a broad range of commercial and academic streaming media applications.
    Accepted version. Peer reviewed.

    Image and Video Coding Techniques for Ultra-low Latency

    The next generation of wireless networks fosters the adoption of latency-critical applications such as XR, connected industry, or autonomous driving. This survey gathers implementation aspects of different image and video coding schemes and discusses their tradeoffs. Standardized video coding technologies such as HEVC or VVC provide a high compression ratio, but their enormous complexity sets the scene for alternative approaches like still image, mezzanine, or texture compression in scenarios with tight resource or latency constraints. Regardless of the coding scheme, we found inter-device memory transfers and the lack of sub-frame coding as limitations of current full-system and software-programmable implementations.
    Published version. Peer reviewed.

    Machine Learning based Efficient QT-MTT Partitioning Scheme for VVC Intra Encoders

    The next-generation Versatile Video Coding (VVC) standard introduces a new Multi-Type Tree (MTT) block partitioning structure that supports Binary-Tree (BT) and Ternary-Tree (TT) splits in both vertical and horizontal directions. This new approach leads to five possible splits at each block depth and thereby improves the coding efficiency of VVC over that of the preceding High Efficiency Video Coding (HEVC) standard, which only supports Quad-Tree (QT) partitioning with a single split per block depth. However, MTT has also brought a considerable impact on encoder computational complexity. In this paper, a two-stage learning-based technique is proposed to tackle the complexity overhead of MTT in VVC intra encoders. In our scheme, the input block is first processed by a Convolutional Neural Network (CNN) to predict its spatial features through a vector of probabilities describing the partition at each 4x4 edge. Subsequently, a Decision Tree (DT) model leverages this vector of spatial features to predict the most likely splits at each block. Finally, based on this prediction, only the N most likely splits are processed by the Rate-Distortion (RD) process of the encoder. In order to train our CNN and DT models on a wide range of image contents, we also propose a public VVC frame partitioning dataset based on an existing image dataset encoded with the VVC reference software encoder. Our proposal relying on the top-3 configuration reaches 46.6% complexity reduction for a negligible bitrate increase of 0.86%. A top-2 configuration enables a higher complexity reduction of 69.8% for 2.57% bitrate loss. These results emphasize a better trade-off between VTM intra coding efficiency and complexity reduction compared to the state-of-the-art solutions.
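
The final stage described in the abstract above, where only the N most likely splits are passed to the RD process, can be sketched as a simple ranking step. This is a minimal illustration, not the authors' implementation: the split labels in `SPLITS`, the function name `top_n_splits`, and the assumption that the predictor yields one probability per split are all hypothetical.

```python
# Hypothetical labels for the five split types named in the abstract:
# one quad-tree split plus binary and ternary splits in both directions.
SPLITS = ("QT", "BT_H", "BT_V", "TT_H", "TT_V")


def top_n_splits(probabilities, n):
    """Return the n split labels with the highest predicted probability.

    Assumes `probabilities` holds one score per entry of SPLITS, in order.
    Only the returned splits would then be evaluated by the RD search.
    """
    if len(probabilities) != len(SPLITS):
        raise ValueError("expected one probability per split type")
    ranked = sorted(zip(SPLITS, probabilities), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:n]]


if __name__ == "__main__":
    scores = [0.10, 0.40, 0.05, 0.30, 0.15]  # made-up example scores
    print(top_n_splits(scores, 3))  # top-3 configuration
    print(top_n_splits(scores, 2))  # top-2 configuration
```

A smaller N skips more RD evaluations, which matches the trade-off reported in the abstract: the top-2 configuration saves more complexity than top-3 but at a higher bitrate cost.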